Text File | 1993-01-04 | 6KB | 124 lines

Outline

1. Files Needed or Produced by Software
2. Training and Testing Data
   a. IANS Format
   b. Data Files Included With This Package
3. File and Neural Net Limitations


1. Files Needed or Produced by Software

a. MLP and functional link neural networks typically have three types of files associated with them:

(1) The network structure file. For the MLP, this file specifies the number of network layers, the number of artificial neurons (called units) in each layer, and the number of the first layer to which the third and fourth (if there is one) layers connect. For the functional link net, this file contains the network degree P (usually an integer between 1 and 5), the number of network inputs N, the number of outputs, and the dimension of the multinomial vector, which is L = (N+P)!/(N!P!).

(2) The weight file, which gives the gains or coefficients along the paths connecting the various units.

(3) The training or testing data file, which gives example inputs and outputs for network learning, or for testing after learning.

b. The network structure files have the extension "top". If you want, you can create your own network structure files within the backpropagation, fast training, and functional link programs. Consider the MLP structure file, Grng.top, shown below.

      4
      16 20 10 4
      1 1 1

It has 4 layers. The first layer has 16 inputs, which means that each training or testing pattern has 16 input numbers. It has 20 units in the first hidden layer, where "hidden" means that it is not an input or output layer. It has 10 units in the second hidden layer. The output layer has 4 units, corresponding to the 4 possible decisions that the network can make about the 16 input numbers. The last line of "1s" means that layer 2 connects with layer 1, layer 3 connects with layers 1 and 2, and layer 4 connects with layers 1, 2, and 3. This network is "fully connected", meaning that each layer connects with all previous layers.
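
As an illustration only, the structure file described above can be read with a few lines of Python. This is a minimal sketch, not the package's own reader: it assumes the file holds whitespace-separated integers in the order layer count, units per layer, then the connection entries, and the names parse_mlp_structure and multinomial_dimension are hypothetical. The second function evaluates the functional link dimension L = (N+P)!/(N!P!), which equals the binomial coefficient C(N+P, P).

```python
import math


def parse_mlp_structure(text):
    """Read an MLP structure (.top) file given as text.

    Assumed layout (an assumption, not confirmed by the package):
    layer count, then one unit count per layer, then connection entries.
    """
    tokens = [int(t) for t in text.split()]
    n_layers = tokens[0]
    layer_sizes = tokens[1:1 + n_layers]
    if len(layer_sizes) != n_layers:
        raise ValueError("fewer unit counts than the layer count promises")
    # Whatever follows the unit counts is kept as the connection entries
    # (the "last line of 1s" in Grng.top).
    connections = tokens[1 + n_layers:]
    return n_layers, layer_sizes, connections


def multinomial_dimension(n_inputs, degree):
    """Return L = (N+P)!/(N!P!) for N inputs and network degree P."""
    return math.comb(n_inputs + degree, degree)


# The contents of Grng.top as shown in the text.
GRNG_TOP = "4\n16 20 10 4\n1 1 1\n"
n_layers, layer_sizes, connections = parse_mlp_structure(GRNG_TOP)
print(n_layers, layer_sizes, connections)

# A functional link net with N = 2 inputs and degree P = 2.
print(multinomial_dimension(2, 2))
```

Because the sketch splits on any whitespace, it does not depend on how the numbers are divided among the lines of the file.
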
Fully connected networks are more powerful than, and train faster than, networks that are not fully connected. A fully connected network is almost always smaller than a non-fully-connected network that performs the same operation.


2. Training and Testing Data

a. IANS Format

All data files must be put into formatted IANS form, which means that each pattern or feature vector is followed by the correct class number (class id). The data analysis and pre-processing option (number 2) puts raw data into the IANS format. You can type out the data files to examine them, and you can use these files with other neural net software. For example, consider the training data file, Xor, shown below.

      0. 0. 1
      0. 1. 2
      1. 0. 2
      1. 1. 1

There are four training patterns with two inputs each. Patterns 1 and 4 belong to class 1, as indicated by the class number, 1, at the end of the first and last rows. The middle two rows, or training patterns, belong to class 2.

The software can convert this file into either of two forms during training or testing. If the network is to have coded outputs, the number of output units is about log to the base 2 of the number of classes. For the file Xor, for example, log2(2) = 1, so topology file Xor.top has 1 output unit. If we were to use file Xor to train a network having uncoded outputs, then the number of outputs would equal the number of classes. Since Xor, Par, and Par4 are used in coded output networks in our demos, the corresponding networks have one output each. Since file Grng has four classes and is used in uncoded output networks, the networks for it have four outputs.

Internal to the software, file Xor is converted to the form

      0. 0. 0
      0. 1. 1
      1. 0. 1
      1. 1. 0

if the coded format is specified, where the third column stores the desired outputs for the corresponding patterns. If uncoded format is used, the file would be converted to the form

      0. 0. 0 1
      0. 1. 1 0
      1. 0. 1 0
      1. 1. 0 1

For pattern number 1, for example, the first output unit has the desired value of 0, corresponding to class 1.

b. Data Files Included With This Package

The XOR data file, which corresponds to exclusive or, has 4 patterns, 2 classes, and 2 inputs.

The PAR4 data file, which corresponds to 4-input parity check, has 16 patterns, 2 classes, and 4 inputs.

The GRNG data file, which corresponds to recognition of 4 geometric shapes, has 4 classes, 800 vectors or patterns, and 16 inputs.

The Gongtrn data file, which corresponds to recognition of handprinted numerals, has 10 classes, 3,000 vectors or patterns, and 16 inputs or features calculated from 32 by 24 pixel binary images.


3. File and Neural Net Limitations

There is no limitation on data file size.

MLP neural nets are limited to 40 or fewer units in each layer, including the input layer, one or two hidden layers, and the output layer.

Functional link networks are limited to 40 inputs, 15 outputs, and 5th degree.

Conventional clustering and self-organizing map clustering are limited to 32 elements per vector and 2,048 clusters.

There is no limit on the number of input patterns.
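
Returning to the IANS format of section 2a, the conversion of class numbers into coded and uncoded desired outputs can be sketched in Python as follows. This is an illustration under stated assumptions, not the package's internal routine: the function names are hypothetical, the coded form is assumed to be the binary representation of (class number - 1) with the most significant bit first, and the uncoded form follows the Xor example, in which the output unit for the correct class has the desired value 0 and all other units have the value 1.

```python
def parse_ians(text, n_inputs):
    """Split IANS-format text into (inputs, class_id) pairs.

    Each pattern is n_inputs numbers followed by one class number.
    """
    tokens = text.split()
    width = n_inputs + 1
    if len(tokens) % width != 0:
        raise ValueError("token count is not a multiple of n_inputs + 1")
    patterns = []
    for start in range(0, len(tokens), width):
        inputs = [float(t) for t in tokens[start:start + n_inputs]]
        class_id = int(tokens[start + n_inputs])
        patterns.append((inputs, class_id))
    return patterns


def coded_targets(class_id, n_classes):
    """Binary code of (class_id - 1); the bit order is an assumption."""
    n_outputs = max(1, (n_classes - 1).bit_length())  # about log2(n_classes)
    code = class_id - 1
    return [(code >> bit) & 1 for bit in range(n_outputs - 1, -1, -1)]


def uncoded_targets(class_id, n_classes):
    """One output per class: 0 for the correct class, 1 for the others."""
    return [0 if k == class_id else 1 for k in range(1, n_classes + 1)]


# The Xor training file as shown in section 2a.
XOR_DATA = "0. 0. 1\n0. 1. 2\n1. 0. 2\n1. 1. 1\n"
xor_patterns = parse_ians(XOR_DATA, n_inputs=2)

for inputs, class_id in xor_patterns:
    print(inputs, coded_targets(class_id, 2), uncoded_targets(class_id, 2))
```

For the two-class Xor file this reproduces the two converted forms shown in section 2a. For files with more than two classes the coded bit order is a guess, so check it against the software before relying on it.
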